Add embeddings for a few economics indicators #4971
base: master
Nice. Just wondering whether all the descriptions for each statistical variable are necessary, or whether we can clean this up and keep one good description per variable.
@@ -3387,6 +3387,7 @@ sdg/ER_PTD_FRHWTR,Average proportion of Freshwater Key Biodiversity Areas covere
sdg/ER_PTD_MTN,Average proportion of Mountain Key Biodiversity Areas covered by protected areas
sdg/ER_PTD_TERR,Average proportion of Terrestrial Key Biodiversity Areas covered by protected areas
sdg/ER_RSK_LST,Red List Index
sdg/FP_CPI_TOTL_ZG,"Annual inflation, consumer prices;Annual inflation rate as measured by the consumer price index;Percentage change in the cost to the average consumer of acquiring a basket of goods and services that may be fixed or changed at specified intervals, such as yearly;Annual inflation rate consumer price index"
The first description ("Annual inflation, consumer prices"), the second ("Annual inflation rate as measured by the consumer price index"), and the last ("Annual inflation rate consumer price index") are all quite similar. Are they all needed? When we overhauled the embeddings last year, the goal was to have one good, distinct description per variable, maybe two if necessary.
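To make the overlap concrete, here is a minimal sketch that flags near-duplicate descriptions within one row. It assumes each row is `dcid,descriptions` with descriptions separated by `;`, as in the hunk above. It uses word-set Jaccard overlap as a cheap lexical stand-in, not the embedding similarity the index actually uses, and the function names and the `0.5` threshold are hypothetical.

```python
import csv
import io
import re
from itertools import combinations

# Row copied from the diff above; the two-column layout is an assumption.
INFLATION_ROW = (
    'sdg/FP_CPI_TOTL_ZG,"Annual inflation, consumer prices;'
    "Annual inflation rate as measured by the consumer price index;"
    "Percentage change in the cost to the average consumer of acquiring "
    "a basket of goods and services that may be fixed or changed at "
    "specified intervals, such as yearly;"
    'Annual inflation rate consumer price index"'
)


def split_descriptions(cell):
    """Split a semicolon-separated description cell into clean strings."""
    return [d.strip() for d in cell.split(";") if d.strip()]


def word_set(text):
    """Lowercased set of alphanumeric tokens in a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-set Jaccard overlap between two descriptions (0.0 to 1.0)."""
    wa, wb = word_set(a), word_set(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def near_duplicates(cell, threshold=0.5):
    """Return (desc_a, desc_b, score) for pairs at or above the threshold."""
    pairs = []
    for a, b in combinations(split_descriptions(cell), 2):
        score = round(jaccard(a, b), 2)
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


# csv handles the quoted cell, which contains commas of its own.
dcid, cell = next(csv.reader(io.StringIO(INFLATION_ROW)))
for a, b, score in near_duplicates(cell):
    print(f"{dcid}: {score} overlap between {a!r} and {b!r}")
```

A lexical check like this only catches wording overlap; two descriptions can still land close together in embedding space while sharing few words, so it is a first-pass filter rather than a replacement for the attached eval reports.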
@@ -3602,7 +3603,11 @@ worldBank/4_1_SHARE_RE_IN_ELECTRICITY,Renewable electricity share of total elect
worldBank/EG_ELC_ACCS_RU_ZS,percentage of rural population with access to electricity
worldBank/EG_ELC_ACCS_UR_ZS,percentage of urban population with access to electricity
worldBank/EG_ELC_ACCS_ZS,percentage of population with access to electricity
worldBank/FR_INR_DPST,"Deposit interest rate;Deposit interest rate is the rate paid by commercial or similar banks for demand, time, or savings deposits"
Also, I'm not sure the description of what a deposit interest rate is is necessary. The more sentences we add, the harder it is to keep variables in distinct regions of the embedding space.
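One way to catch definition-style sentences before review is a simple length lint over the description column. This is only a sketch: it assumes the same two-column `dcid,descriptions` layout with `;` separators seen in the hunks above, and the `long_descriptions` name and the 12-word limit are hypothetical choices, not a project convention.

```python
import csv
import io

# Rows copied from the diff hunks above; the layout is an assumption.
SAMPLE_ROWS = (
    "worldBank/EG_ELC_ACCS_ZS,percentage of population with access to electricity\n"
    'sdg/FP_CPI_TOTL_ZG,"Annual inflation, consumer prices;'
    "Annual inflation rate as measured by the consumer price index;"
    "Percentage change in the cost to the average consumer of acquiring "
    "a basket of goods and services that may be fixed or changed at "
    "specified intervals, such as yearly;"
    'Annual inflation rate consumer price index"\n'
    'worldBank/FR_INR_DPST,"Deposit interest rate;'
    "Deposit interest rate is the rate paid by commercial or similar banks "
    'for demand, time, or savings deposits"\n'
)


def long_descriptions(csv_text, max_words=12):
    """Return (dcid, description) for each description over max_words words."""
    flagged = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        dcid, cell = row
        for description in cell.split(";"):
            description = description.strip()
            if len(description.split()) > max_words:
                flagged.append((dcid, description))
    return flagged


for dcid, description in long_descriptions(SAMPLE_ROWS):
    print(f"{dcid}: {len(description.split())} words: {description}")
```

On these sample rows the lint flags the basket-of-goods sentence for the inflation variable and the full definition for the deposit interest rate, which are the two descriptions questioned in this review.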
Add embeddings to base for a few economic indicators.
Reports with diffs are attached below.
SV Index Differ.pdf
NL Eval Playground - Data Commons.pdf